OpenCL on FPGAs for GPU Programmers
Abstract
Data Parallelism and Kernels
Data parallelism is a form of parallelism in which multiple processors each perform the same task on a different piece of distributed data. The data-parallel portions of an algorithm are executed on devices as kernels, which are C functions with some restrictions and a few language extensions. The host launches kernels across a 1D, 2D, or 3D grid of work-items to be processed by the devices. Conceptually, work-items can be thought of as individual processing threads that each execute the same kernel function. Each work-item has a unique index within the grid and typically computes a different portion of the result. Work-items are grouped into work-groups, which are expected to execute independently of one another.
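The execution model described above can be sketched in a few lines of plain Python. This is only a sequential emulation under stated assumptions, not OpenCL itself: `launch` and `vector_add` are hypothetical names invented for illustration, and the index tuples stand in for what an OpenCL C kernel would obtain from built-ins such as `get_global_id(0)`.

```python
from itertools import product


def launch(kernel, global_size, local_size, *args):
    """Sequentially emulate a kernel launch over a 1D, 2D, or 3D grid.

    Hypothetical helper (not an OpenCL API): calls `kernel` once per
    work-item, passing its global, local, and work-group indices.
    """
    if len(global_size) != len(local_size):
        raise ValueError("global and local sizes need the same dimensionality")
    if any(g % l != 0 for g, l in zip(global_size, local_size)):
        raise ValueError("each global size must be a multiple of the local size")

    groups = [g // l for g, l in zip(global_size, local_size)]
    # Work-groups are expected to be independent, so this loop order is
    # just one of many valid schedules.
    for group_id in product(*(range(n) for n in groups)):
        for local_id in product(*(range(l) for l in local_size)):
            # Unique index of this work-item within the whole grid.
            global_id = tuple(
                g * l + i for g, l, i in zip(group_id, local_size, local_id)
            )
            kernel(global_id, local_id, group_id, *args)


def vector_add(global_id, local_id, group_id, a, b, out):
    """Kernel body: every work-item runs this same function on its own element."""
    i = global_id[0]
    out[i] = a[i] + b[i]


a = list(range(8))
b = [10 * x for x in a]
out = [None] * 8

# 8 work-items in a 1D grid, split into 2 work-groups of 4.
launch(vector_add, (8,), (4,), a, b, out)
print(out)  # [0, 11, 22, 33, 44, 55, 66, 77]
```

On a real device the work-items would run concurrently rather than in a loop. The sketch only shows how a grid index selects each work-item's share of the data and how work-groups partition the grid.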
Similar resources
OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs
OpenCL support across many heterogeneous nodes (FPGAs, GPUs, CPUs) has increased the programmability of these systems significantly. At the same time, it opens up new challenges and design choices for system designers and application programmers. While OpenCL offers a universal semantic to capture the parallel behavior of applications independent of the target architecture, some customization s...
Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL
Recent developments in High Level Synthesis tools have attracted software programmers to accelerate their high-performance computing applications on FPGAs. Even though it has been shown that FPGAs can compete with GPUs in terms of performance for stencil computation, most previous work achieves this by avoiding spatial blocking and restricting input dimensions relative to FPGA on-chip memory. In...
High-performance Dynamic Programming on FPGAs with OpenCL
Field programmable gate arrays (FPGAs) provide reconfigurable computing fabrics that can be tailored to a wide range of time and power sensitive applications. Traditionally, programming FPGAs required expertise in complex hardware description languages (HDLs) or proprietary high-level synthesis (HLS) tools. Recently, Altera released the world's first OpenCL conformant SDK for FPGAs. OpenCL is...
Loop2GPU: Transforming Loops to OpenCL Kernels as a LLVM Pass
Lately, programmers have started to take advantage of the GPU capabilities of their systems. Still, programming for the GPU can be very hard. We are trying to hide some of this complexity from the programmer by making the compiler automatically transform embarrassingly parallel loops to GPU kernels. To this end, we have implemented a compiler pass that transforms simple loops to OpenCL kernels.
Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL
Modern SoCs are getting increasingly heterogeneous with a combination of multi-core architectures and hardware accelerators to speed up the execution of compute-intensive tasks at considerably lower power consumption. Modern FPGAs, due to their reasonable execution speed and comparatively lower power consumption, are strong competitors to the traditional GPU-based accelerators. High-level Synthe...